Parallelization of local BLAST service on workstation clusters

Authors

  • R. C. Braun
  • Kevin T. Pedretti
  • Thomas L. Casavant
  • Todd E. Scheetz
  • Clayton L. Birkett
  • Chad A. Roberts
Abstract

This paper describes approaches to improving the performance of one of the most common and increasingly important operations in the Human Genome Project (HGP): large-volume, batch comparison of DNA sequence data. This basic comparison operation, usually carried out by the well-known BLAST program on one subject sequence against the internationally available databases of nearly five million target sequences, is already used hundreds of thousands of times each day by researchers around the world. At present, it is still used primarily in single-query or small-batch mode. As the entire sequence of the human genome nears completion, the area of functional genomics, and the use of micro-arrays of sets of genes, is coming to the fore. These developments will demand ever more efficient means of BLASTing sets of data, to the point that single-processor implementations on powerful workstations become infeasible. We describe three primary levels at which BLAST can be parallelized. The first is at the sequence-to-sequence comparison level. The second parallelizes a single query across a partitioned and distributed database. In the third, the set of queries itself is partitioned across a set of servers with replicated or partitioned databases. The three methods may be employed alone or in concert. We describe our current implementation, which parallelizes batch requests, and our plans for implementing the other levels. The results will ultimately be applied to hardware assistance for this soon-to-be primitive computer operation. © 2001 Elsevier Science B.V. All rights reserved.
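The second and third levels named in the abstract (one query over a partitioned database, and a batch of queries split across servers) can be illustrated with a minimal Python sketch. It is not the authors' implementation and it does not call BLAST: `toy_score` is a hypothetical stand-in that counts shared k-mers, threads stand in for cluster nodes, and every function name here is an assumption made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Database = List[Tuple[str, str]]   # (target name, sequence) pairs
Hit = Tuple[int, str]              # (score, target name)


def toy_score(query: str, target: str, k: int = 3) -> int:
    """Hypothetical stand-in for an alignment score: distinct shared k-mers."""
    q = {query[i:i + k] for i in range(len(query) - k + 1)}
    t = {target[i:i + k] for i in range(len(target) - k + 1)}
    return len(q & t)


def partition(items: list, n: int) -> List[list]:
    """Round-robin split of items into n chunks (some may be empty)."""
    return [items[i::n] for i in range(n)]


def best_hits(query: str, db: Database, top: int = 1) -> List[Hit]:
    """Sequential search: score the query against every target, keep the best."""
    hits = [(toy_score(query, seq), name) for name, seq in db]
    return sorted(hits, key=lambda h: (-h[0], h[1]))[:top]


def search_partitioned_db(query: str, db: Database, n_fragments: int,
                          top: int = 1) -> List[Hit]:
    """One query, database split into fragments searched concurrently.

    Each fragment returns its own top hits; merging and re-sorting those
    partial lists gives the same result as a search of the whole database.
    """
    fragments = partition(db, n_fragments)
    with ThreadPoolExecutor(max_workers=n_fragments) as pool:
        partial = list(pool.map(lambda f: best_hits(query, f, top), fragments))
    merged = [hit for part in partial for hit in part]
    return sorted(merged, key=lambda h: (-h[0], h[1]))[:top]


def search_batch(queries: Dict[str, str], db: Database,
                 n_servers: int) -> Dict[str, List[Hit]]:
    """Query set split across servers, each holding a full database replica."""
    chunks = partition(list(queries), n_servers)

    def serve(chunk: List[str]) -> Dict[str, List[Hit]]:
        return {name: best_hits(queries[name], db) for name in chunk}

    results: Dict[str, List[Hit]] = {}
    with ThreadPoolExecutor(max_workers=n_servers) as pool:
        for part in pool.map(serve, chunks):
            results.update(part)
    return results
```

In this sketch the two levels compose naturally: `serve` could call `search_partitioned_db` instead of `best_hits` to combine query partitioning with a partitioned database, which mirrors the abstract's remark that the methods may be used alone or in concert.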

Similar resources

TurboBLAST(r): A Parallel Implementation of BLAST Built on the TurboHub

BLAST (Basic Local Alignment Search Tool) is by far the most widely used application for rapid screening of large sequence databases. This paper describes TurboBLAST, a parallel implementation of BLAST suitable for execution on networked clusters of heterogeneous PCs, workstations, or Macintosh computers.

Green Destiny + mpiBLAST = Bioinfomagic

This paper outlines how our highly efficient, power-aware supercomputer called Green Destiny and our open-source parallelization of BLAST called mpiBLAST combine to create a bit of “bioinfomagic.” Green Destiny, featured in The New York Times and winner of a 2003 R&D 100 Award, revolutionized high-performance computing by re-defining performance to focus on issues of efficiency, reliability, an...

Parallelizing the Symbolic Manipulation Program FORM Part I: Workstation Clusters & Message Passing

The present paper is the first of a series of papers reporting on the parallelization of the symbolic manipulation program FORM on different parallel architectures. Part I deals with workstation clusters using dedicated network hardware and the message passing libraries (MPI and PVM). After a short introduction to the sequential version of FORM a detailed analysis of the different platforms us...

Parallel Cluster Labeling on a Network of Workstations

In recent years, encouraged by today’s fast workstations and by software systems designed to transform workstation clusters into parallel programming environments, networks of workstations have been increasingly used as computational engines. Networked workstations, however, are not ideal replacements for supercomputers, because of the low interconnection capacity provided by current local area n...

A Comparison Between Different Parallelization Methods on Workstation Clusters to Solve CFD-Problems

The efficient parallel solution of flow problems on parallel computers requires highly efficient numerical methods as well as highly efficient parallelization methods. Parallelization is mostly done using domain decomposition methods. With the introduction of the so-called combination method a big jump in computational speed is possible. Here, a numerical solution is computed on a so-called sparse grid...

Journal title:
  • Future Generation Comp. Syst.

Volume 17

Publication date 2001